Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement custom loading of pybindings to replace imp.load_dynamic #131

Merged
merged 1 commit into from
Feb 2, 2024

Conversation

cnweaver
Copy link
Contributor

@cnweaver cnweaver commented Dec 8, 2023

Write our own dynamic loading and module initialization code which does not depend on the imp module. This breaks the dilemma between the imp module being removed in python 3.12 and not being able to load the python bindings via the standard import mechanism because the library names do not conform to python's non-standard expectations.

An SPT3G_PYTHON_MODULE macro is introduced to wrap BOOST_PYTHON_MODULE. This is needed to work around an awkward detail of how python modules are initialized:

This is a bit of a hack: when the shared library is loaded,
the module name is "package.module", but the module calls
PyModule_Create*() with just "module" for the name. The shared
library loader squirrels away the true name of the module in
_PyRuntime.imports.pkgcontext, and PyModule_Create*() will
substitute this (if the name actually matches).

This causes problems because each class created within a module takes the module object's name to form its own fully-qualified name, so for those to be correct, we need to patch the modules __name__ after it is created (by PyModule_Create/Py_InitModule), but before any of its own initialization code starts running. The new wrapper macro allows us to insert the fix-up code which does this. (It would be elegant to use the same interface that python uses internally to have PyModule_Create/Py_InitModule just do the right thing for us, but that interface is private and has not been stable across versions.)

This has been tested on python 2.7, 3.7, 3.8, and 3.12.
There is one test failure on python 3.12, however, it appears to be unrelated:

39/48 Test #39: maps/transtest .............................***Failed    1.76 sec
Traceback (most recent call last):
  File "/Users/christopher/Work/CMB-S4/spt3g_software-public-git/maps/tests/transtest.py", line 32, in <module>
    g = c.transform_to(astropy.coordinates.Galactic)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/christopher/.local/lib/python3.12/site-packages/astropy/coordinates/baseframe.py", line 1272, in transform_to
    raise ConvertError(msg.format(self.__class__, new_frame.__class__))
astropy.coordinates.errors.ConvertError: Cannot transform from <class 'astropy.coordinates.builtin_frames.fk5.FK5'> to <class 'abc.ABCMeta'>

as the problem can be reproduced by a reduced test-case which does not involve spt3g_software, suggesting that the cause is an astropy bug:

import numpy as np
import astropy.coordinates
from astropy import units as u

np.random.seed(42)
n_samps = int(1e3)

ra_0 = np.random.rand(n_samps)*2.0*np.pi - np.pi
pole_avoidance = 0.7
dec_0 = np.random.rand(n_samps)*np.pi*pole_avoidance-np.pi/2.0*pole_avoidance

c = astropy.coordinates.FK5(ra=np.asarray(ra_0)*u.rad, dec=np.asarray(dec_0)*u.rad)
g = c.transform_to(astropy.coordinates.Galactic)

There are also various test failures on python 2.7, but they appear to involve various cases of invalid (too-new) syntax in existing scripts, but other scripts without those issues work, and most tests pass.

Addresses #127

@cnweaver cnweaver added the bug label Dec 8, 2023
@cnweaver cnweaver self-assigned this Dec 8, 2023
Copy link
Member

@arahlin arahlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to continue limping along with python2 support? I'd rather just drop it wholesale at this point, to be honest.

@cnweaver
Copy link
Contributor Author

cnweaver commented Dec 8, 2023

There are a number of places where people have already broken python 2 compatibility (although I think all would be fixable if there were a reason to care); I mostly just wanted to make sure I wouldn't create an unnecessary limitation in this added code. The portion of this code needed to support it is a fairly small fraction of the whole, and aside from removing some #ifdef blocks, removing it wouldn't provide much simplification.

@arahlin
Copy link
Member

arahlin commented Dec 8, 2023

There is one test failure on python 3.12, however, it appears to be unrelated:

I've fixed this on the master branch. Newer versions of astropy expect the transform_to argument to be an initialized instance of the coordinate class.

@arahlin
Copy link
Member

arahlin commented Dec 24, 2023

Is there a particular reason we don't name the libraries using the standard convention? Seems like an easy solution would be to name the libraries the way that python expects...

@cnweaver
Copy link
Contributor Author

That requires breaking with the conventions expected by everything else for library naming, which make it annoying to link anything else with them. The reason this situation is weirder/harder for us than for most anyone else is that combining python bindings into the same library which is used when python is not (directly) involved is highly atypical.

@arahlin
Copy link
Member

arahlin commented Dec 24, 2023

Can we use symlinks for one or the other?

@arahlin
Copy link
Member

arahlin commented Jan 30, 2024

Ping here, it sounds like people are starting to run into this issue more often with updated software. Is this PR ok to merge?

@cnweaver
Copy link
Contributor Author

cnweaver commented Feb 2, 2024

Sorry about the delay, I was trying to do a bit more checking of the other approach with symlinks to present the same libraries with two different names. That seems to fundamentally work, but there's some detail that I can't quite get right. A few notes in case we ever want to re-visit it:

The main change is to install a symlink with an ugly name (e.g. _core.so for core) in the python module directory, instead of the actual library:

--- a/cmake/Spt3gIncludes.cmake.in
+++ b/cmake/Spt3gIncludes.cmake.in
@@ -61,7 +61,9 @@ macro(add_spt3g_library lib_name)
        list(APPEND SPT3G_LIBRARIES ${lib_name})
        set(SPT3G_LIBRARIES ${SPT3G_LIBRARIES} PARENT_SCOPE)
        install(TARGETS ${lib_name} EXPORT ${PROJECT_NAME}Config LIBRARY DESTINATION "${CMAKE_INSTALL_LIBDIR}")
-       install(TARGETS ${lib_name} DESTINATION ${PYTHON_MODULE_DIR}/spt3g)
+       INSTALL(CODE "execute_process(COMMAND ${CMAKE_COMMAND} -E create_symlink \
+                     ${CMAKE_INSTALL_PREFIX}/${CMAKE_INSTALL_LIBDIR}/${CMAKE_SHARED_LIBRARY_PREFIX}${lib_name}${CMAKE_SHARED_LIBRARY_SUFFIX} \
+                     ${PYTHON_MODULE_DIR}/spt3g/_${lib_name}.so)")
 endmacro(add_spt3g_library lib_name)

The leading underscore (or some equivalent mangling) is important for an import statement to be able to distinguish the compiled module from the python portion. Doing this requires changing each compiled module's name to match, i.e. tacking on a leading underscore. Each python __init__ script must be then changed to import the compiled module with its mangled name:

--- a/core/python/__init__.py
+++ b/core/python/__init__.py
@@ -1,5 +1,4 @@
-from spt3g.core.load_pybindings import load_pybindings
-load_pybindings(__name__, __path__)
+from spt3g._core import *

In testing, this works, except for importing the modules with compiled components besides core if the working driectory is the CMake build directory. My best guess is that this is because of either witchcraft or dynamic linker search paths, and I have not been able to find a way to fix it.

At any rate, it seems like we should be able to re-evaluate or change how we do this in future if we want, so I think it's safe to merge the custom loader version for now.

@cnweaver cnweaver merged commit a9d32d5 into master Feb 2, 2024
1 check passed
@cnweaver cnweaver deleted the python312-pybindings-loading branch February 2, 2024 19:46
@arahlin
Copy link
Member

arahlin commented Feb 2, 2024

Sure sounds like witchcraft to me! Yeah, I think it would be nicer in the long run if we didn't have to roll our own loader, so let's revisit at some point. What about if you used a relative import instead, e.g. from .._core import *, so that the global search path doesn't come into play?

@arahlin arahlin mentioned this pull request Feb 2, 2024
@cnweaver
Copy link
Contributor Author

cnweaver commented Feb 2, 2024

Unfortunately, the relative import yields the same error. It seems like it is finding the symlink (one would hope for a different error if not), but it's adamant that the symbol isn't in the library, even though nm says it is.

@arahlin
Copy link
Member

arahlin commented Feb 2, 2024

Unfortunately, the relative import yields the same error. It seems like it is finding the symlink (one would hope for a different error if not), but it's adamant that the symbol isn't in the library, even though nm says it is.

Gremlins. Ok, some more head-scratching for later, this solution is fine for the time being. Thanks for putting in the work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants